Maximum information measurement for qubit states

We determine the optimal measurement that maximizes the average information gain about the state of a qubit system. The qubit is prepared in one of two known states with known prior probabilities. To treat the problem analytically we employ the formalism developed for the maximum confidence quantum state discrimination strategy and obtain the POVM which optimizes the information gain for the entire parameter space of the system. We show that the optimal measurement coincides exactly with the minimum-error quantum measurement only for two pure states, or when the two states have the same Bloch radius or they are on the same diagonal of the Bloch disk.


Árpád Varga 1, Peter Adam 1,2* & János A. Bergou 1,3
In quantum information the carriers of information are quantum systems, and information is encoded in their states. Extracting this information is a central problem in quantum information processing, and it can be done by determining the state via measurements [1].
In many quantum communication schemes, the information is the state itself. In these schemes a sender, Alice, prepares an ensemble of quantum systems, each in a state from a set of $n$ known states, $\{\rho_j \mid j = 1, \ldots, n\}$, the letter states. The weight of state $\rho_j$ in this initial ensemble is $\eta_j$, called the a priori probability or simply the prior. Alice then randomly draws a system from this initial ensemble and sends it to the receiver, Bob. The set of possible states as well as their priors are also known to the receiver, whose task is to identify the state of the system he received. If the states are mutually orthogonal the task is easy: Bob sets up detectors along these orthogonal directions, and a click in one of them perfectly determines the input. However, if the possible states are not mutually orthogonal, the problem is highly nontrivial. Bob needs to choose a figure of merit and find a measurement that is optimal with respect to it. Accordingly, several strategies have been developed with respect to various criteria. Optimization, in general, leads to complex measurement strategies, often involving generalized measurements. Some of the frequently employed strategies are discrimination with minimum error (ME) [2-6], unambiguous discrimination (UD) [7-12], and maximum confidence (MC) [13-16] discrimination.
The ME strategy was introduced in Refs. [2-4] for two states (pure or mixed) with arbitrary priors. In this strategy, every time Bob receives a system he has to make a guess about its state based on the outcome of his measurement. The price to pay is that errors must be allowed. In the optimal strategy the average probability of error is minimized. The ME strategy involving more than two states is known in some special cases only.
The UD strategy was introduced in Refs. [7-9] for two pure states with equal priors and was later generalized for arbitrary prior probabilities in Ref. [10]. In the UD strategy, no errors are allowed. The price to pay is that Bob must be allowed to return inconclusive answers. In the optimal strategy the average probability of inconclusive answers is minimized. An important result states that UD is possible if the states are linearly independent [17], which is not a requirement for ME. The UD strategy is successfully used in sequential state discrimination, which is a strategy for N separate receivers [18-22].
We note that each strategy has its own advantages and drawbacks when we try to apply it to a general measurement problem. It is difficult to find measurements realizing unambiguous discrimination for mixed states, but it is relatively easy to generalize this strategy to more than two states, at least in principle [23]. The ME strategy handles mixed states and pure states on an equal footing but is hard to generalize to more than two states, except for some special, highly symmetric cases [24,25] (although progress has been made recently in this area [26,27]).
Another independent strategy, called maximum confidence (MC), was introduced in Ref. [13]. The aim of the MC strategy is to construct a measurement which maximizes the confidence $C_j$: the conditional probability that the state $\rho_j$ was prepared, given that detector $j$ clicks. In the case of linearly independent states this strategy coincides with the UD strategy. However, when the states to be discriminated are not linearly independent, this is an independent strategy [28]. For further developments in this line of state discrimination studies we refer to the recent reviews [14-16].
In quantum communication one can look for a measurement strategy maximizing the mutual information between the communicating parties. In this problem the sender sends a sequence of individual quantum systems, each taken from a given set of known states $E = \{\rho_j \mid j = 1, \ldots, n\}$, and the receiver measures them one by one, possibly by a POVM with the POVM elements $\Pi_m$, where $m = 1, \ldots, N$. Our task is to maximize the Shannon mutual information [2,3,29],

$$I = \sum_{j=1}^{n} \sum_{m=1}^{N} \eta_j\, p(m|j)\, \log_2 \frac{p(m|j)}{\sum_{k=1}^{n} \eta_k\, p(m|k)}, \tag{1}$$

between the measurement outcomes $m$ and the input states $j$, where $\eta_j$ is the a priori probability of the state $j$, and $p(m|j) = \mathrm{Tr}(\Pi_m \rho_j)$ is the conditional probability of getting the measurement result $m$, given that the state $\rho_j$ was prepared.
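As a minimal illustration of Eq. (1), the following sketch evaluates the mutual information for given priors $\eta_j$ and conditional probabilities $p(m|j)$. The function name `mutual_information` and the two test channels are illustrative assumptions, not part of the original analysis; the logarithm is taken to base 2, so the result is in bits.

```python
import math

def mutual_information(priors, cond):
    """Shannon mutual information (bits) between input j and outcome m.

    priors[j] = eta_j, cond[j][m] = p(m|j) = Tr(Pi_m rho_j).
    """
    n_in, n_out = len(priors), len(cond[0])
    # Total probability of each outcome: P(m) = sum_j eta_j p(m|j)
    p_out = [sum(priors[j] * cond[j][m] for j in range(n_in))
             for m in range(n_out)]
    total = 0.0
    for j in range(n_in):
        for m in range(n_out):
            joint = priors[j] * cond[j][m]
            if joint > 0.0:  # convention: 0 log 0 = 0
                total += joint * math.log2(cond[j][m] / p_out[m])
    return total

# Two limiting cases for equal priors (hypothetical inputs):
# perfectly distinguishable outcomes and completely uninformative outcomes.
perfect = mutual_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
useless = mutual_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```

The two limiting cases bracket every measurement on two equiprobable states: orthogonal states read out perfectly give one bit, and outcomes independent of the input give zero.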
For a given set of states and their a priori probabilities, the problem is to find a measurement which maximizes the mutual information. While the mutual information is known to obey the Holevo bound [30], it is important to determine the accessible information, which is the actual maximum over all possible measurements. This is a special problem in state discrimination: we want to maximize the correlation between state preparation and measurement outcomes, i.e., we want to devise a measurement strategy that yields maximum information about which state was prepared. We will refer to such a measurement as the maximum information (MI) measurement. The solution to this problem is known only for a few special cases. Even determining the amount of information which can be encoded in a given quantum system is a nontrivial task [31-36]. We note that the problem we address consists in maximizing mutual information between classical random variables linked via a quantum encoding-decoding scheme. Therefore, it is related to the calculation of channel capacities. It is, however, different from quantum channel capacity concepts that maximize quantum (as opposed to classical) mutual information [37-41].
The number of POVM elements needed to maximize the information gain is in general unknown. It is known [42] that for any ensemble in $d$ dimensions there is an optimal strategy with at most $N$ elements, where $d \le N \le d^2$. Sasaki et al. [43] showed that for the case of real states (that is, states with a real density operator), the upper bound is $d(d + 1)/2$. They also gave explicit solutions for the case of real and symmetric states in two dimensions, and showed that at most three POVM elements are necessary. Levitin [44] conjectured that if the number of possible states is $N \le d$, the optimal measurement will always be a von Neumann measurement. This conjecture was proven to hold for two pure states in arbitrary dimensions by Levitin. However, Shor [45] gave a counterexample involving three real pure states in three dimensions (qutrits). Considering qubits only, Keil [46] proved that von Neumann measurements are always optimal for two general states.
Fuchs and Peres [47] studied numerically the trade-off between the information gain and the measurement-induced disturbance. Ban et al. [48] gave analytic results for pure binary signal states, and showed the connection between the ME measurement and the MI measurement for this special case. Řeháček et al. [49] gave an iterative algorithm to find the optimal POVM for the accessible information and illustrated the method on an example in three dimensions. There are also lower [50] and upper [51,52] bounds to the accessible information for simple cases, which depend explicitly only on the message ensemble.
In this paper we consider the problem of finding the optimal measurement to maximize the mutual information for a general qubit system. Our approach makes use of the method developed for the maximum confidence strategy and leads to analytical insight. In particular, we determine the POVM in parametric form with a single parameter, which maximizes the information gain for the entire parameter space of the system.

Information gain and confidence probabilities: General formalism
We begin with a study of the simplest case of two qubit states in a two-dimensional Hilbert space and present an alternative derivation of Eq. (1) for this case. We cast the result in a form that shows the intrinsic connection of the information gain with the confidence probabilities introduced in Ref. [13].
Recall that we consider a two-party protocol with a sender, Alice, and a receiver, Bob. Alice prepares an ensemble of qubit systems where each qubit is either in the state $\rho_1$ or in the state $\rho_2$. The first state is prepared a fraction $\eta_1$ of the time and the second state is prepared a fraction $\eta_2 = 1 - \eta_1$ of the time. The ensemble is described by the density matrix

$$\rho = \eta_1 \rho_1 + \eta_2 \rho_2. \tag{2}$$

Alice then randomly draws a quantum system from this ensemble and sends it over to Bob.
A more elaborate but equivalent way of describing state preparation is as follows. Alice initially prepares an ensemble of two-qubit states, keeps one particle, and sends the other, particle $b$, over to Bob. Then she performs a measurement in the computational basis on the particle in her possession. If she finds the result $|0\rangle$ she knows that Bob's particle is in the state $\rho_1$, and if she finds the result $|1\rangle$ she knows that Bob's particle is in the state $\rho_2$. If this is repeated a large number of times, Bob will receive the state $\rho_1$ a fraction $\eta_1$ of the time and the state $\rho_2$ a fraction $\eta_2$ of the time, on average.
Either way, Bob has no knowledge of the actual state he received; all he knows are the prior probabilities, $\eta_1$ and $\eta_2 = 1 - \eta_1$. The initial information uncertainty is given by

$$S_i = -\eta_1 \log_2 \eta_1 - \eta_2 \log_2 \eta_2, \tag{3}$$

and the corresponding initial information is denoted by $I_i$ [Eq. (4)]. The question we are addressing here is: How much information can Bob gain by performing measurement(s) on the system he received? To this end we will consider the following general model of quantum measurement. In accordance with the results described in the Introduction, we assume that Bob has $N$ detectors described by the set of rank-1 operators $\{\Pi_m \mid m = 1, \ldots, N\}$ with $d \le N \le d^2$, adding up to the identity operator,

$$\sum_{m=1}^{N} \Pi_m = \mathbb{1}. \tag{5}$$

The latter condition ensures that, given the measured system in any state, one of the detectors will click. The conditional probability that detector $m$ clicks, given a system in the state $\rho_j$, is calculated using Born's rule:

$$P(m|j) = \mathrm{Tr}(\Pi_m \rho_j). \tag{6}$$

In order to have non-negative probabilities we have to require the positivity (non-negativity) of the detection operators,

$$\Pi_m \ge 0. \tag{7}$$

Equations (5) and (7) define a Positive Operator Valued Measure (POVM), which is simply a decomposition of the identity in terms of positive operators, called the elements of the POVM.
Next, we use Bayesian updating, employing Bayes' theorem, $P(m|j)P(j) = P(j|m)P(m)$, for conditional probabilities. In particular, we apply this formula to the situation when $j$ stands for the state $\rho_j$ and $m$ for the detector $\Pi_m$. Then $P(j) = \eta_j$ is the prior probability of state $j$, $P(m|j)$ is the detection probability given in Eq. (6), i.e., the probability that detector $m$ records an event if state $j$ is given, and

$$P(m) = \sum_j \eta_j P(m|j)$$

is the total probability that detector $m$ records an event. Using Eq. (2), the last expression can be written as

$$P(m) = \mathrm{Tr}(\Pi_m \rho). \tag{8}$$

Thus we find that the conditional probability $P(\rho_j|\Pi_m)$, the probability that if detector $m$ records an event it is due to the state $j$, can be written as

$$C_{jm} \equiv P(\rho_j|\Pi_m) = \frac{\eta_j\, \mathrm{Tr}(\Pi_m \rho_j)}{\mathrm{Tr}(\Pi_m \rho)}. \tag{9}$$

$C_{jm}$ is the confidence (or confidence probability), which is the central quantity in the MC state discrimination strategy.
Equipped with the confidences we next give the information uncertainty for the case when detector $m$ clicks. Since the confidences $C_{jm}$ are the posterior probabilities of the states given a click in detector $m$, this uncertainty is

$$S_m = -\sum_j C_{jm} \log_2 C_{jm}. \tag{10}$$

The average uncertainty is

$$S_f = \sum_m P(m)\, S_m, \tag{11}$$

where $P(m)$ is given by Eq. (8). The information after the measurement, $I_f$, is defined from $S_f$ in the same way as $I_i$ is defined from $S_i$. Finally, the information gain from the measurement can be given as

$$\Delta I = I_f - I_i = S_i - S_f, \tag{12}$$

where $S_i$ is given by Eq. (3) and $S_f$ is given by Eq. (11). Substituting the explicit expressions for the various quantities obtained so far into Eq. (12), it can be shown that this equation is identical to Eq. (1). In this formulation the contributions from the prior and posterior information appear clearly separated. Furthermore, $S_i$ is constant for a given set of priors, so it is independent of the measurement we perform. Therefore, optimizing the information gain is equivalent to finding the POVM that minimizes the second term, the information uncertainty $S_f$, Eq. (11). As noted before, the information gain maximized over all possible measurements is also called the accessible information. In the next section we develop a fully analytical theory that provides the accessible information (optimal solution) in parametric form for all values of the parameters.
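The equivalence between the information gain built from the confidences and the mutual information can be checked numerically. The sketch below is only an illustration under assumed inputs: the priors and conditional probabilities are hypothetical, the helper names are ours, and base-2 logarithms are assumed throughout.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, using the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def gain_from_confidences(priors, cond):
    """Delta I = S_i - S_f, with S_f built from the confidences C_jm."""
    states = range(len(priors))
    s_f = 0.0
    for m in range(len(cond[0])):
        p_m = sum(priors[j] * cond[j][m] for j in states)          # P(m)
        if p_m > 0.0:
            conf = [priors[j] * cond[j][m] / p_m for j in states]  # C_jm
            s_f += p_m * entropy(conf)                             # P(m) S_m
    return entropy(priors) - s_f

def mutual_information(priors, cond):
    """Direct evaluation of the mutual information, for comparison."""
    states = range(len(priors))
    total = 0.0
    for m in range(len(cond[0])):
        p_m = sum(priors[j] * cond[j][m] for j in states)
        for j in states:
            joint = priors[j] * cond[j][m]
            if joint > 0.0:
                total += joint * math.log2(cond[j][m] / p_m)
    return total

priors = [0.3, 0.7]                  # hypothetical priors eta_1, eta_2
cond = [[0.9, 0.1], [0.2, 0.8]]      # hypothetical p(m|j)
gain = gain_from_confidences(priors, cond)
direct = mutual_information(priors, cond)
```

Both routes give the same number up to rounding, which is the identity $I(J;M) = H(J) - H(J|M)$ written in the language of confidences.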

Accessible information for qubits
To treat the optimization problem effectively, we employ the formalism developed for the maximum confidence strategy in Ref. [13]. Equations (11) and (12) are expressed in terms of the confidence probabilities, so they provide a convenient starting point. The method yields the optimal solution analytically in parametric form, in terms of a single parameter.
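As the first step we introduce transformed density and measurement operators by the definitions

$$\bar{\rho}_j = \eta_j\, \rho^{-1/2} \rho_j\, \rho^{-1/2} \tag{14}$$

and

$$\bar{\Pi}_m = \frac{\rho^{1/2}\, \Pi_m\, \rho^{1/2}}{\mathrm{Tr}(\Pi_m \rho)}, \tag{15}$$

where $\rho$ is defined in Eq. (2). The transformed states satisfy

$$\bar{\rho}_1 + \bar{\rho}_2 = \mathbb{1}. \tag{16}$$

It follows from this expression that the transformed states $\bar{\rho}_{1,2}$ have the same set of eigenvectors. Using the transformed operators we can write the confidence $C_{jm}$, Eq. (9), in the more compact form

$$C_{jm} = \mathrm{Tr}(\bar{\Pi}_m \bar{\rho}_j). \tag{17}$$

We wish to maximize the information gain (12) [or, equivalently, minimize the final uncertainty $S_f$ (11)] due to the measurement. $S_f$ is already expressed in terms of the confidences, while the outcome probability $P(m)$ is rewritten in terms of the transformed operators in Eq. (18). Using Eqs. (14) and (15) it is easy to show from Eq. (5) that the transformed measurement operators $\bar{\Pi}_m$ satisfy the equation

$$\sum_m P(m)\, \bar{\Pi}_m = \rho. \tag{19}$$

We note that the transformation $\Pi_m \to \bar{\Pi}_m$ is rank preserving. Thus the $\bar{\Pi}_m$ become rank-1 projectors, not necessarily orthogonal. All we can say is that, as seen from Eq. (19), they correspond to a pure state decomposition of $\rho$, not necessarily in terms of orthogonal pure states. Furthermore, a pair of qubit states is always unitarily equivalent to a pair of real states: their Bloch vectors can be chosen to span the $x$-$z$ plane of the Bloch sphere. Therefore, we can assume that $\bar{\Pi}_m$ is also real. The general expression of a real rank-1 matrix of unit trace can be written as

$$\bar{\Pi}_m = \begin{pmatrix} \alpha_m & \pm\sqrt{\alpha_m(1-\alpha_m)} \\ \pm\sqrt{\alpha_m(1-\alpha_m)} & 1-\alpha_m \end{pmatrix}, \qquad 0 \le \alpha_m \le 1. \tag{20}$$

For the calculation that follows it is convenient to use the common eigenvectors of $\bar{\rho}_1$ and $\bar{\rho}_2$ as the basis. Let the eigenvectors be $|1\rangle$ and $|2\rangle$, and the eigenvalues of $\bar{\rho}_1$ be $\lambda_1$ and $\lambda_2$. Eq. (16) immediately gives that the eigenvalues of $\bar{\rho}_2$ are $1 - \lambda_1$ and $1 - \lambda_2$. Substituting (20) first into (17), we obtain

$$C_{1m} = \lambda_1 \alpha_m + \lambda_2 (1 - \alpha_m), \qquad C_{2m} = (1 - \lambda_1)\, \alpha_m + (1 - \lambda_2)(1 - \alpha_m). \tag{21}$$

Without loss of generality we can assume $\lambda_1 \ge \lambda_2$, from which $\lambda_1 \ge C_{1m} \ge \lambda_2$ and $1 - \lambda_2 \ge C_{2m} \ge 1 - \lambda_1$ follow. Substituting Eq. (20) next into Eq. (19), we find

$$\sum_m P(m)\, \alpha_m = \rho_{11}, \tag{22}$$

$$\sum_m P(m)\, (1 - \alpha_m) = \rho_{22}, \tag{23}$$

$$\sum_m \pm P(m) \sqrt{\alpha_m (1 - \alpha_m)} = \rho_{12}. \tag{24}$$

Here, $\rho_{ij}$ are the matrix elements of $\rho$ in the basis formed by the eigenstates of $\bar{\rho}_1$.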
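The properties of the transformed operators can be verified numerically for a concrete pair of real qubit states. The following sketch assumes the standard maximum-confidence transformation, $\bar{\rho}_j = \eta_j \rho^{-1/2}\rho_j\rho^{-1/2}$ and $\bar{\Pi}_m = \rho^{1/2}\Pi_m\rho^{1/2}/\mathrm{Tr}(\Pi_m\rho)$; the priors, states, and measurement direction are hypothetical, and the helper functions are ours. The matrix square root uses the closed form $\sqrt{A} = (A + \sqrt{\det A}\,\mathbb{1})/\sqrt{\mathrm{Tr}A + 2\sqrt{\det A}}$, valid for $2 \times 2$ positive-definite matrices.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scaled(a, c):
    return [[c * v for v in row] for row in a]

def trace(a):
    return a[0][0] + a[1][1]

def sqrt_pd(a):
    """Square root of a 2x2 positive-definite matrix (Cayley-Hamilton form)."""
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    t = math.sqrt(trace(a) + 2.0 * s)
    return [[(a[0][0] + s) / t, a[0][1] / t],
            [a[1][0] / t, (a[1][1] + s) / t]]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def bloch_state(r, theta):
    """Real qubit state with Bloch vector r (sin theta, 0, cos theta)."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    return [[(1 + z) / 2, x / 2], [x / 2, (1 - z) / 2]]

eta = [0.4, 0.6]                                   # hypothetical priors
rhos = [bloch_state(0.9, 0.0), bloch_state(0.5, math.pi / 4)]
rho = [[eta[0] * rhos[0][i][j] + eta[1] * rhos[1][i][j] for j in range(2)]
       for i in range(2)]
root = sqrt_pd(rho)
inv_root = inverse(root)

# Transformed states: eta_j rho^{-1/2} rho_j rho^{-1/2}
rho_bar = [scaled(mul(mul(inv_root, rhos[k]), inv_root), eta[k])
           for k in range(2)]
# One rank-1 POVM element (a projector along angle 0.7) and its transform
proj = bloch_state(1.0, 0.7)
p_m = trace(mul(proj, rho))
pi_bar = scaled(mul(mul(root, proj), root), 1.0 / p_m)

sum_bar = [[rho_bar[0][i][j] + rho_bar[1][i][j] for j in range(2)]
           for i in range(2)]
conf_direct = [eta[k] * trace(mul(proj, rhos[k])) / p_m for k in range(2)]
conf_transformed = [trace(mul(pi_bar, rho_bar[k])) for k in range(2)]
```

In this example `sum_bar` reproduces the identity matrix, `pi_bar` has unit trace, and the two ways of computing the confidences agree, as expected from the definitions.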
Up to this point our consideration is general, as we have not imposed any restriction on the number $N$ of POVM elements. However, it has been proven that for a pair of qubit states the optimal measurement is projective [44,46], that is, $N = 2$. Then $\Pi_m = P_m$, where $\{P_m \mid m = 1, 2\}$ are rank-1 orthogonal projectors, $P_m P_{m'} = P_m \delta_{mm'}$ ($m, m' = 1, 2$). Therefore, from now on, we deal with the case of two orthogonal detectors and use the notation $C_1 \equiv C_{11}$ and $C_2 \equiv C_{22}$ [Eqs. (25) and (26)], since in this case we want to identify a click in detector $m = j$ with $\rho_j$. Hence $C_j$ is the probability of "good" events for the corresponding detector.
Using this notation in (21) and then the resulting expression in (22) and (23), some lengthy but straightforward algebra yields the click probabilities $P_m \equiv P(m)$,

$$P_1 = \frac{C_2 - \eta_2}{C_1 + C_2 - 1}, \qquad P_2 = \frac{C_1 - \eta_1}{C_1 + C_2 - 1}. \tag{27}$$

Substituting (27) into Eq. (11) and then using the resulting expression in (12) we obtain

$$\Delta I = -\sum_{j=1}^{2} \eta_j \log_2 \eta_j + \frac{C_2 - \eta_2}{C_1 + C_2 - 1} \left[ C_1 \log_2 C_1 + (1 - C_1) \log_2 (1 - C_1) \right] + \frac{C_1 - \eta_1}{C_1 + C_2 - 1} \left[ C_2 \log_2 C_2 + (1 - C_2) \log_2 (1 - C_2) \right], \tag{28}$$

which is one of our central results. It expresses the information gain $\Delta I$ entirely in terms of the confidences $C_1$ and $C_2$, and the prior probabilities $\eta_1$ and $\eta_2$. Remarkably, this expression is independent of the structure of the states to be discriminated. Equations (22) and (23) together with Eq. (21) allowed us to express the information gain entirely in terms of the confidence probabilities. The remaining Eq. (24) represents the main constraint under which (28) should be optimized. From Eqs. (24) and (27) it is easy to obtain a relation between $C_1$ and $C_2$, Eq. (29). The first term on its left-hand side is a function of quantities related to state 1 alone, while the second term is a function of quantities related to state 2 alone. Therefore, they separately must be equal to a universal function of the parameters, which we denote by $a$. The function $a$ still depends on the parameters of the problem but not on $C_1$ and $C_2$ and, in order to satisfy (29), it must be antisymmetric under the exchange $1 \leftrightarrow 2$. In terms of $a$ the two terms can be written separately, Eq. (30). This provides a straightforward analytical solution to the entire problem. Substituting $\alpha_1$ and $\alpha_2$ from Eqs. (21), we obtain the constraint in parametric form. It expresses $C_1$ and $C_2$, and hence the information gain, in terms of the single parameter $a$.
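The dependence of the information gain on $C_1$, $C_2$ and the priors alone can be illustrated with a short numerical sketch. It assumes that the click probabilities follow from normalization, $P_1 + P_2 = 1$, and from the total probability of state 1, $P_1 C_1 + P_2 (1 - C_2) = \eta_1$, which gives $P_1 = (C_2 - \eta_2)/(C_1 + C_2 - 1)$ and $P_2 = (C_1 - \eta_1)/(C_1 + C_2 - 1)$; the function names and the numerical inputs are hypothetical.

```python
import math

def h2(c):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in (c, 1.0 - c) if p > 0.0)

def click_probabilities(c1, c2, eta1):
    """P_1, P_2 from P_1 + P_2 = 1 and P_1 C_1 + P_2 (1 - C_2) = eta_1."""
    eta2 = 1.0 - eta1
    denom = c1 + c2 - 1.0        # vanishes at pure guessing, C_j = eta_j
    return (c2 - eta2) / denom, (c1 - eta1) / denom

def gain_from_c1_c2(c1, c2, eta1):
    """Information gain S_i - S_f written with C_1, C_2 and the prior only."""
    p1, p2 = click_probabilities(c1, c2, eta1)
    return h2(eta1) - p1 * h2(c1) - p2 * h2(c2)

# Hypothetical two-outcome measurement: eta_1 = 0.3, p(1|1) = 0.9, p(1|2) = 0.2
eta1, p11, p12 = 0.3, 0.9, 0.2
P1 = eta1 * p11 + (1.0 - eta1) * p12
P2 = 1.0 - P1
C1 = eta1 * p11 / P1                       # confidence of detector 1 in state 1
C2 = (1.0 - eta1) * (1.0 - p12) / P2       # confidence of detector 2 in state 2

gain = gain_from_c1_c2(C1, C2, eta1)
reflected = gain_from_c1_c2(1.0 - C2, 1.0 - C1, eta1)
```

The value `reflected` equals `gain`, a numerical instance of the invariance under $C_1 \leftrightarrow 1 - C_2$. The function is undefined at the pure-guessing point $C_1 + C_2 = 1$, where the denominator vanishes.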
More importantly, however, it leads to a visual geometric solution, which is the central result of this paper. It can serve as a guide to find the exact solution for any values of the parameters specifying the problem. We introduce the geometric approach in the next subsection and illustrate its power on several examples.

Geometric optimization
First, we introduce a convenient parametrization of the problem. Recall that two qubit states are always unitarily equivalent to two real states: the corresponding two Bloch vectors span a plane in the Bloch sphere, and this plane can always be unitarily rotated to the $x$-$z$ plane. We can thus restrict our discussion to this plane, also termed the Bloch disk. With a further rotation, the Bloch vector of one of the states, say $\rho_1$, can be aligned with the $z$ axis. So we assume, without loss of generality, that our states are real from the beginning and the Bloch vector $\mathbf{r}_1$ of the first state is along the $z$ axis, that is, we use the following parametrization of the states:

$$\rho_i = \frac{1}{2} \left[ \mathbb{1} + r_i \left( \sin\theta_i\, \sigma_x + \cos\theta_i\, \sigma_z \right) \right], \qquad i = 1, 2. \tag{31}$$

Here, $\mathbb{1}$ is the two-dimensional identity operator, $\sigma_x$ and $\sigma_z$ are Pauli matrices, $r_i$ is the Bloch radius, and $\theta_i$ is the polar angle of state $\rho_i$, measured from the $z$ axis. The parameters are shown in Fig. 1, where $0 < r_1, r_2 \le 1$, $\theta_1 = 0$, and $0 \le \theta_2 \le \pi$. In the following, we use these parameters to present our results.
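A short sketch of this parametrization is given below. It assumes the standard form $\rho_i = \frac{1}{2}[\mathbb{1} + r_i(\sin\theta_i\,\sigma_x + \cos\theta_i\,\sigma_z)]$ and represents a projective measurement in the Bloch disk by a unit vector at angle $\varphi$ from the $z$ axis; the function names, the angle symbol `phi`, and the numerical values are illustrative assumptions.

```python
import math

def bloch_state(r, theta):
    """rho = (1 + r sin(theta) sigma_x + r cos(theta) sigma_z) / 2."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    return [[(1.0 + z) / 2.0, x / 2.0], [x / 2.0, (1.0 - z) / 2.0]]

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def click_probability(r, theta, phi):
    """Born rule for the projector pointing along angle phi in the disk."""
    return (1.0 + r * math.cos(phi - theta)) / 2.0

rho2 = bloch_state(0.5, math.pi / 4)       # mixed state with r_2 = 0.5
projector = bloch_state(1.0, 1.2)          # rank-1 projector at phi = 1.2
tr_rho_sq = trace_product(rho2, rho2)      # equals (1 + r^2) / 2
born = trace_product(projector, rho2)      # Tr(Pi rho_2)
```

With this parametrization every click probability reduces to $(1 + r \cos(\varphi - \theta))/2$, so a projective measurement in the Bloch disk is described by the single angle $\varphi$.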
For given fixed values of the parameters, that is, the prior probabilities $\eta_1$ and $\eta_2$ and the eigenvalues of the transformed states, $\lambda_1$ and $\lambda_2$, the constraint (29) [or its parametric version, (30)] can be easily plotted in the $C_1$-$C_2$ plane. This gives us a unique 8-shaped curve on which we have to find the optimal values of $C_1$ and $C_2$. To this end, we notice that the information gain expression, (28), for a fixed value of $\Delta I$ is also a curve in the same plane. If we choose the fixed value of $\Delta I$ too large, the two curves do not intersect. Lowering the value of $\Delta I$, at a certain threshold value the two curves become tangent. This value is the maximal information gain $\Delta I_{\max}$ available by the measurement, that is, the accessible information. The procedure is illustrated in Fig. 2. We should also mention that the values $\Delta I < \Delta I_{\max}$ correspond to feasible (suboptimal) measurements, all the way to the self-intersection point of the constraint, $C_1 = \eta_1$ and $C_2 = \eta_2$ (that is, $C_1 = C_2 = 0.5$ for equal priors), which corresponds to pure guessing.
Figure 3 shows two examples of the geometric optimization, that is, the constraint (30) and the information gain (28) plotted together in the $C_1$-$C_2$ plane, for two sets of parameters of the input states. The figure shows that increasing the prior probability $\eta_1$ of the state $\rho_1$ increases the optimal confidence probability $C_1$ of that state while reducing the confidence $C_2$ of the other state.
We find numerically that the optimal values are $C_1 = 0.4879$ and $C_2 = 0.9469$ for the left panel, and $C_1 = 0.7867$ and $C_2 = 0.8338$ for the right panel. It is interesting to note that in the first case $C_1 < 0.5$, so a click in either detector is more likely to be due to $\rho_2$ than to $\rho_1$.
In order to interpret these results, we point out that Figs. 2 and 3 are symmetric under reflection about the $C_1 + C_2 = 1$ line. This property follows from the fact that the information gain, Eq. (28), and the constraint, Eq. (29), are invariant under the substitution $C_1 \leftrightarrow 1 - C_2$. In particular, the constraint, which is represented by the 8-shaped dashed line in these plots, has this symmetry, and the point where it intersects itself has coordinates $C_1 = \eta_1$ and $C_2 = \eta_2$. These values correspond to pure guessing with no actual measurement performed, and using them in Eq. (28) leads to $\Delta I_{\min} = 0$. As noted before, Eq. (28) also gives a relation between $C_1$ and $C_2$ for a fixed value of the information gain $\Delta I$. When plotted in the $C_1$-$C_2$ plane, it exhibits two disjoint segments that are related by the reflection symmetry about the $C_1 + C_2 = 1$ line. The optimal measurement corresponds to the points where the solid line, Eq. (28), is tangent to the dashed line, Eq. (30). It can be seen that there are two sets of solutions, related by the same symmetry. Feasible measurements are in the region bounded by the two solid lines, yielding a value $\Delta I$ in the range $\Delta I_{\max} > \Delta I > \Delta I_{\min} = 0$ as we approach, from either of the boundaries, the point $C_1 = \eta_1$ and $C_2 = \eta_2$ where $\Delta I_{\min} = 0$. It should also be noted that, as a consequence of Eq. (21), knowledge of $\lambda_1$ and $\lambda_2$, the eigenvalues of the transformed states in Eq. (16), is sufficient to find the optimum measurement.
Figure 1. Parametrization of real states: $r_1$ and $r_2$ are the Bloch radii measured from the origin, $\theta_2$ is the polar angle relative to the $z$ axis ($\theta_1 = 0$). $R$ is the Euclidean distance of the states $\rho_1$ and $\rho_2$, and $\varphi$ is the angle between $\mathbf{r}_1$ and $\mathbf{R}$.

Figure 2. The constraint (30) (dashed line) and the information gain (28) (dotted lines) plotted together in the $C_1$-$C_2$ plane, for various fixed values of the information gain ($\Delta I_A = 0.18$, $\Delta I_B = 0.3$, $\Delta I_C = 0.37$). The solid line corresponds to the maximal information gain, $\Delta I_{\max} = 0.23129$. Optimal values of $C_1$ and $C_2$ are the coordinates of the point where the solid line is tangent to the dashed curve (note that there are two sets of optimal solutions). The values of the parameters are $\eta_1 = \eta_2 = 1/2$, $r_1 = 0.9$, $\theta_1 = 0$, and $r_2 = 0.5$, $\theta_2 = \pi/4$.

In summary, there are two key points of the geometric approach to optimization. First, the constraint (30), linking $C_1$ and $C_2$, restricts us to a curve in the $C_1$-$C_2$ plane, and the maximum of the information gain has to be found along this curve. Second, for a fixed value of $\Delta I$ (such that $0 < \Delta I < 1$), the expression for the information gain, Eq. (28), also corresponds to a curve in the same plane. If we choose $\Delta I$ too large, the two curves may not have any common points. For intermediate values the two curves may have more than one common point. The maximal value $\Delta I_{\max}$ of the information gain is the unique value of $\Delta I$ for which the $\Delta I(C_1, C_2)$ curve becomes tangent to the constraint. This can still happen at more than one point, and the coordinates of these points, $C_1$ and $C_2$, all correspond to optimal measurements; the value of $\Delta I_{\max}$, however, is unique. The actual value can be found numerically. Then we substitute the $C_i$ values corresponding to the tangent points back into (25) and (26) and, using Eqs. (20), (15) and (14), we arrive at the POVM(s) which yield(s) the maximum information about the system (MI POVM).
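The numerical step can also be carried out directly in terms of the measurement direction, without passing through the $C_1$-$C_2$ plane. The sketch below is one possible implementation under stated assumptions, not the procedure used for the figures: it relies on the result quoted above that the optimal measurement for two qubit states is projective and real, describes it by a single angle `phi` in the Bloch disk, and maximizes the information gain by a coarse grid search followed by golden-section refinement. All function names and default settings are ours.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def info_gain(phi, eta1, r1, r2, theta2):
    """Information gain of the projective measurement along angle phi."""
    p1 = (1.0 + r1 * math.cos(phi)) / 2.0            # p(1|1), theta_1 = 0
    p2 = (1.0 + r2 * math.cos(phi - theta2)) / 2.0   # p(1|2)
    p_click = eta1 * p1 + (1.0 - eta1) * p2          # P(1)
    return h2(p_click) - eta1 * h2(p1) - (1.0 - eta1) * h2(p2)

def maximize_gain(eta1, r1, r2, theta2, grid=2000, iters=60):
    """Coarse grid over [0, pi) followed by golden-section refinement."""
    f = lambda phi: info_gain(phi, eta1, r1, r2, theta2)
    step = math.pi / grid
    best = max(range(grid), key=lambda k: f(k * step)) * step
    lo, hi = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        a, b = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(a) < f(b):
            lo = a       # maximum lies in [a, hi]
        else:
            hi = b       # maximum lies in [lo, b]
    phi = (lo + hi) / 2.0
    return phi, f(phi)

# Example call for equal priors and two mixed states (illustrative values)
phi_opt, gain_opt = maximize_gain(0.5, 0.9, 0.5, math.pi / 4)
```

The search interval $[0, \pi)$ suffices because rotating the measurement direction by $\pi$ only exchanges the labels of the two detectors and leaves the information gain unchanged. From `phi_opt` the optimal confidences follow from Eq. (9).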

Comparison of the MI and ME strategies
Although it was initially assumed that the minimum error and the maximum information measurements coincide, a careful numerical study revealed that, in general, they are different [49]. Therefore, in the following we present a systematic study of how these two measurements compare. Anticipating the results, we find that the ME and MI measurements coincide for the case of two pure states and, generally, for the case when the two states have the same Bloch radius ($r_1 = r_2$) or they are on the same diagonal of the Bloch disk.
In the ME strategy one is looking for measurement operators $\Pi_i$ that maximize the expression

$$P_S = \eta_1 \mathrm{Tr}(\Pi_1 \rho_1) + \eta_2 \mathrm{Tr}(\Pi_2 \rho_2). \tag{32}$$

$P_S$ is the average probability of correctly guessing the input state, aided by the measurement. Introducing

$$\Lambda = \eta_1 \rho_1 - \eta_2 \rho_2$$

and using $\Pi_1 + \Pi_2 = \mathbb{1}$, we can write Eq. (32) in a more compact form,

$$P_S = \eta_2 + \mathrm{Tr}(\Lambda \Pi_1).$$

This expression is clearly maximal if $\Pi_1$ is the projector onto the subspace of $\Lambda$ belonging to its positive eigenvalues. Hence, $\Pi_2$ will be the projector onto the subspace belonging to the negative eigenvalues. In order to show the relationship between the ME and MI measurements, we first write Eq. (32) in the form

$$P_S = P_1 C_1 + P_2 C_2,$$

which can be obtained if we divide and multiply the first term on the right-hand side of Eq. (32) by $P_1$ and the second by $P_2$, and use the definition Eq. (8) for the click probabilities and Eq. (9) for the confidences. It has been observed earlier [28] that the minimum error measurement is also the one that maximizes the average confidence. Furthermore, the relation (27) between the click probabilities and the confidences still holds, so finally we can write $P_S$ as

$$P_S = \frac{(C_2 - \eta_2)\, C_1 + (C_1 - \eta_1)\, C_2}{C_1 + C_2 - 1}.$$

This is the cost function to be maximized in the ME measurement. When we compare this to the cost function, Eq. (28), of the MI measurement, we see that there are similarities, e.g., the click probabilities are the same and, in addition, the constraint (29) is also the same for both measurements. Apart from these similarities, the two cost functions are rather different. So, there is no a priori reason for the ME and MI measurements to be the same. We will show that they are indeed generally different, except for special cases when the input states have some intrinsic symmetry. What we know is that they are both projective measurements in the $x$-$z$ plane. After determining their respective orientations, the difference between the ME and MI strategies can be characterized by the angle $\delta$ between their projectors. Next, we study the dependence of $\delta$ on the structure of the input states. The MI and ME POVMs are determined by using the methods presented previously. Without loss of generality, we choose the parameters of $\rho_1$ as $r_1$ and $\theta_1 = 0$ and study the dependence of $\delta$ on the parameters of $\rho_2$, that is, on the polar angle $\theta_2$ and the Bloch radius $r_2$ in the Bloch disk, introduced in Fig. 1. In this paper, we focus on the case of equal priors, $\eta_1 = \eta_2 = 1/2$; the case of arbitrary priors will be addressed in a separate publication. Note that, in the following figures, $\delta$ is measured in degrees, while the polar angles are measured in radians.
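The comparison of the two strategies can be sketched numerically for equal priors. The code below is an illustration under several assumptions of ours: the ME measurement axis is taken along $\mathbf{r}_1 - \mathbf{r}_2$, the Bloch vector of $\rho_1 - \rho_2$; the MI axis is found by a grid search with golden-section refinement over projective measurements in the Bloch disk; and $\delta$ is taken as the angle between the two measurement axes in the Bloch disk, folded into $[0°, 90°]$, which may differ from the convention used in the figures. Function names and parameter values are hypothetical.

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def mi_angle(r1, r2, theta2, grid=2000, iters=60):
    """Measurement direction maximizing the information gain (equal priors)."""
    def gain(phi):
        p1 = (1.0 + r1 * math.cos(phi)) / 2.0
        p2 = (1.0 + r2 * math.cos(phi - theta2)) / 2.0
        return h2((p1 + p2) / 2.0) - (h2(p1) + h2(p2)) / 2.0
    step = math.pi / grid
    best = max(range(grid), key=lambda k: gain(k * step)) * step
    lo, hi = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        a, b = hi - g * (hi - lo), lo + g * (hi - lo)
        if gain(a) < gain(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0

def me_angle(r1, r2, theta2):
    """Direction of r_1 - r_2, the positive eigenvector of rho_1 - rho_2."""
    vx = -r2 * math.sin(theta2)            # x component (r_1 lies on z)
    vz = r1 - r2 * math.cos(theta2)        # z component
    return math.atan2(vx, vz)              # angle measured from the z axis

def delta_degrees(r1, r2, theta2):
    """Angle between the ME and MI measurement axes, folded into [0, 90]."""
    d = (mi_angle(r1, r2, theta2) - me_angle(r1, r2, theta2)) % math.pi
    return math.degrees(min(d, math.pi - d))

delta_equal_radius = delta_degrees(0.8, 0.8, 1.0)
delta_unequal_radius = delta_degrees(0.8, 0.5, 1.0)
```

For equal Bloch radii the two axes agree to within the numerical precision of the search, whereas for unequal radii a finite angle remains.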
We consider the general case of two mixed states. Without loss of generality, we assume that $r_2 \le r_1$. Figure 4a shows the difference $\delta$ between the two measurement strategies as a function of the Bloch radius $r_2$ and the polar angle $\theta_2$ characterizing the state $\rho_2$. Figure 4b shows the same quantity as a function of the polar angle $\theta_2$, for representative values of the Bloch radius $r_2$. In these figures $\rho_1$ is a fixed mixed state with $r_1 = 0.8$, $\theta_1 = 0$. From these figures one can deduce that the two strategies coincide (that is, $\delta = 0$) only in the case when the two mixed states have the same Bloch radius ($r_1 = r_2$) or they are along the same diagonal of the Bloch disk ($\theta_2 = 0, \pi$). These rules are valid for any value of $r_1$. Accordingly, the ME and MI strategies coincide for two pure states ($r_1 = r_2 = 1$). For a given $r_2$, $\delta$ exhibits a maximum, $\delta_{\max}$, for a certain value of the polar angle, $\theta_2 = \theta_2^{\max}$. Figure 5 displays the polar angle $\theta_2^{\max}$, corresponding to the maximum difference $\delta_{\max}(r_2)$ between the two strategies, as a function of the Bloch radius $r_2$, for the pure state $\rho_1 = |0\rangle\langle 0|$ ($r_1 = 1$, $\theta_1 = 0$). In this figure, the value of the polar angle $\theta_2^{\max}$ decreases linearly, except for values of $r_2$ close to the other Bloch radius $r_1$. Note that the value $r_2 = 1$, for which the two strategies coincide, is excluded from the domain of the function $\theta_2^{\max}(r_2)$. Figure 4b shows that by increasing the Bloch radius $r_2$ toward the other radius $r_1$ the value of the maximal difference $\delta_{\max}(r_2)$ between the ME and MI detection strategies also increases. We have found that the function $\delta_{\max}(r_2)$ practically reaches its saturated value $\delta_{\max}$ within a precision of 0.01 when $r_1 - r_2 \lesssim 0.001$. Figure 6a presents the maximum difference $\delta_{\max}$ between the ME and MI strategies as a function of the purity $r_1^2$ of the state $\rho_1$. The state $\rho_2$ is always close to $\rho_1$ ($r_1 - r_2 = 0.001$), that is, $R \to 0$ in Fig. 1. The figure shows that with increasing purity $r_1^2$ of state $\rho_1$, $\delta_{\max}$ grows nearly exponentially, reaching its maximum when $\rho_1$ is pure. Figure 6b shows the POVMs for this case in the computational basis. The ME POVM is aligned symmetrically around the states. For the MI strategy, however, one of the POVM elements virtually coincides with the pure state, and the other one is perpendicular to it, so that a click in the latter rules out the pure state. Although the information provided by the measurements vanishes when the mixed state $\rho_2$ is in the close vicinity of the pure state $\rho_1$, we have found that the MI POVM brings more than twice as much information as the ME POVM for the presented case.

Discussion
We have developed an analytic method, supplemented by a geometric approach to optimization, for finding the measurement that yields the maximum information gain about a qubit system that is prepared in one of two known states with given prior probabilities. We have determined the parameters of the POVM of the maximum information gain measurement for two arbitrary (pure or mixed) states, prepared with equal prior probabilities, building on previous results that the optimal measurement is always a standard von Neumann measurement for this case. We have compared the maximum information measurement to the minimum error one, and shown that the POVMs of the two measurement strategies coincide exactly only when both states have the same Bloch radius or they are along the same diagonal of the Bloch disk. The case of general priors will be addressed in a subsequent publication.

Figure 5. The polar angle $\theta_2^{\max}$ for which the difference $\delta$ between the ME and MI strategies is maximal, as a function of the Bloch radius $r_2$ for the pure state $\rho_1 = |0\rangle\langle 0|$ ($r_1 = 1$, $\theta_1 = 0$). The polar angle $\theta_2^{\max}$ is measured in radians.

Figure 4. The difference $\delta$ between the ME and MI measurement strategies for the case of two mixed states as a function of (a) the Bloch radius $r_2$ and the polar angle $\theta_2$ characterizing the state $\rho_2$, and (b) the polar angle $\theta_2$, for representative values of $r_2$. $\rho_1$ is a fixed mixed state at $r_1 = 0.8$, $\theta_1 = 0$. The difference $\delta$ is measured in degrees, while the polar angle $\theta_2$ is measured in radians.

Figure 6. (a) The maximum difference $\delta_{\max}$ between the ME and MI strategies as a function of the purity $r_1^2$ of $\rho_1$. The state $\rho_2$ is always close to $\rho_1$ ($r_1 - r_2 = 0.001$). (b) The POVMs of the MI (thick vectors) and ME (thin vectors) strategies for two states which are very close to each other on the Bloch disk ($\rho_1 = |0\rangle\langle 0|$, $r_2 = 0.999$, $\theta_2 = 0.001$).